25 research outputs found

    Maxiset point of view for signal detection in inverse problems

    This paper extends the successful maxiset paradigm from function estimation to signal detection in inverse problems. In this context, the maxisets do not have the same shape as in the classical estimation framework. Nevertheless, we introduce a robustified version of these maxisets, allowing us to exhibit tail conditions on the signals of interest. Under this novel paradigm we are able to compare direct and indirect testing procedures.

    Hyperbolic Wavelet-Fisz denoising for a model arising in Ultrasound Imaging

    We present an algorithm, and its fully data-driven extension, for noise reduction in ultrasound imaging. Our proposed method computes the hyperbolic wavelet transform of the image before applying a multiscale variance stabilization technique via a Fisz transformation. This adapts the statistics of the wavelet coefficients to the wavelet thresholding paradigm. The aim of the hyperbolic setting is to recover the image while respecting the anisotropic nature of structural details. The data-driven extension removes the need for any prior knowledge of the noise model parameters by estimating the noise variance with an isotonic Nadaraya-Watson estimator. Experiments on synthetic and real data, and comparisons with other noise reduction methods, demonstrate the potential of our method for recovering ultrasound images while preserving tissue details. Finally, we support the noise model we consider by applying our variance estimation procedure to real images.
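    The transform-threshold-invert structure described above can be sketched in a few lines. This is an illustrative reimplementation, not the authors' code: it uses a Haar filter, a plain soft threshold, and omits the Fisz variance-stabilization step the paper applies before thresholding; all function names are ours.

```python
import numpy as np

def haar_forward(x):
    """Multilevel orthonormal Haar transform of a length-2^J vector."""
    c = np.asarray(x, dtype=float).copy()
    n = c.size
    while n > 1:
        even, odd = c[:n:2], c[1:n:2]
        approx = (even + odd) / np.sqrt(2.0)
        detail = (even - odd) / np.sqrt(2.0)
        c[: n // 2] = approx
        c[n // 2 : n] = detail
        n //= 2
    return c

def haar_inverse(c):
    """Inverse of haar_forward."""
    c = np.asarray(c, dtype=float).copy()
    n, total = 1, c.size
    while n < total:
        approx, detail = c[:n].copy(), c[n : 2 * n].copy()
        c[0 : 2 * n : 2] = (approx + detail) / np.sqrt(2.0)
        c[1 : 2 * n : 2] = (approx - detail) / np.sqrt(2.0)
        n *= 2
    return c

def hyperbolic_denoise(img, thr):
    """Hyperbolic transform = full 1D transform along every row, then every
    column; soft-threshold all coefficients except the coarse mean."""
    coeffs = np.apply_along_axis(haar_forward, 1, np.asarray(img, dtype=float))
    coeffs = np.apply_along_axis(haar_forward, 0, coeffs)
    shrunk = np.sign(coeffs) * np.maximum(np.abs(coeffs) - thr, 0.0)
    shrunk[0, 0] = coeffs[0, 0]  # keep the overall mean untouched
    rec = np.apply_along_axis(haar_inverse, 0, shrunk)
    return np.apply_along_axis(haar_inverse, 1, rec)
```

    Note that the fully separable (hyperbolic) transform differs from the usual 2D wavelet transform: each axis is decomposed across all its scales independently, which is what lets the method adapt to anisotropic detail.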

    Topological data analysis of human vowels: Persistent homologies across representation spaces

    Topological Data Analysis (TDA) has been successfully used for various tasks in signal/image processing, from visualization to supervised/unsupervised classification. Often, topological characteristics are obtained from persistent homology theory. The standard TDA pipeline starts from the raw signal data or a representation of it. It then builds a multiscale topological structure on top of the data using a pre-specified filtration, and finally computes a topological signature to be further exploited. The commonly used topological signature is a persistence diagram (or transformations of it). Current research discusses the consequences of the many ways to exploit topological signatures, much less often the choice of the filtration, but to the best of our knowledge, the choice of the representation of a signal has not yet been the subject of any study. This paper attempts to provide some answers to the latter problem. To this end, we collected real audio data and built a comparative study to assess the quality of the discriminant information of the topological signatures extracted from three different representation spaces. Each audio signal is represented as i) an embedding of the observed data in a higher-dimensional space using Takens' representation, ii) a spectrogram viewed as a surface in a 3D ambient space, iii) the set of the spectrogram's zeros. From vowel audio recordings, we use topological signatures for three prediction problems: speaker gender, vowel type, and individual. We show that a topologically augmented random forest improves the out-of-bag (OOB) error over one based solely on Mel-Frequency Cepstral Coefficients (MFCC) for the last two problems. Our results also suggest that the topological information extracted from different signal representations is complementary, and that the spectrogram's zeros offer the best improvement for gender prediction.
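    Representation i), the Takens delay embedding, is simple enough to sketch: each time index is mapped to a point in R^dim built from delayed copies of the signal, and the resulting point cloud is what a filtration is then built on. Parameter names below are ours, not the paper's.

```python
import numpy as np

def takens_embedding(signal, dim=3, delay=1):
    """Return an (n_points, dim) cloud of delay vectors
    [x_t, x_{t+delay}, ..., x_{t+(dim-1)*delay}]."""
    x = np.asarray(signal, dtype=float)
    n_points = x.size - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("signal too short for this (dim, delay)")
    # Column i holds the signal shifted by i*delay samples.
    return np.stack([x[i * delay : i * delay + n_points] for i in range(dim)], axis=1)
```

    Persistent homology of this cloud (e.g. via a Vietoris-Rips filtration) then yields the persistence diagram used as the topological signature.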

    Detecting human and non-human vocal productions in large scale audio recordings

    We propose an automatic data processing pipeline to extract vocal productions from large-scale natural audio recordings. Through a series of computational steps (windowing, creation of a noise class, data augmentation, re-sampling, transfer learning, Bayesian optimisation), it automatically trains a neural network for detecting various types of natural vocal productions in a noisy data stream, without requiring a large sample of labeled data. We test it on two different data sets: one from a group of Guinea baboons recorded at a primate research center, and one from human babies recorded at home. The pipeline trains a model on 72 and 77 minutes of labeled audio recordings, reaching accuracies of 94.58% and 99.76%, respectively. It is then used to process 443 and 174 hours of natural continuous recordings, creating two new databases of 38.8 and 35.2 hours, respectively. We discuss the strengths and limitations of this approach, which can be applied to any massive audio recording.
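    The windowing step that opens the pipeline can be sketched as follows: a long recording is cut into fixed-length, possibly overlapping frames that can then be labeled, augmented, and fed to the classifier. Frame and hop sizes here are arbitrary illustrative values, not the ones used in the paper.

```python
import numpy as np

def frame_signal(signal, frame_len, hop_len):
    """Return an (n_frames, frame_len) array of sliding windows;
    trailing samples that do not fill a final frame are dropped."""
    x = np.asarray(signal, dtype=float)
    n_frames = 1 + (x.size - frame_len) // hop_len
    # Index matrix: row k selects samples [k*hop, k*hop + frame_len).
    idx = hop_len * np.arange(n_frames)[:, None] + np.arange(frame_len)[None, :]
    return x[idx]
```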

    Tree-structured wavelets in nonparametric function estimation

    Wavelet thresholding methods, especially those which pool information from geometric structures in the coefficient domain, are known to be powerful for nonparametric function estimation. In this thesis, we focus on a family of Tree-Structured Wavelet (TSW) estimators, the so-called Vertical Block Thresholding (VBT) family. For each estimator we provide the maximal functional space (maxiset) for which the quadratic risk reaches a given rate of convergence. We identify the ideal estimator of this family, that is, the one associated with the largest maxiset, and we emphasize the importance of considering method-dependent threshold values. While this is still an open research topic for the VBT family, we address the problem in the similar but simpler context of the nonoverlapping Horizontal Block Thresholding family. We next study the situation where wavelet-based estimators cannot be differentiated because their maxisets are not nested. As a generic solution, we propose to combine these estimators so as to obtain new estimators that perform better, in the sense that the involved maxisets contain the union of the previous ones. Finally, we use the relation between TSW and recursive dyadic partitioning to develop a novel method for estimating the spectrum of a stationary process using time series traces recorded from experimental designs. Our procedure estimates the “common” log-spectrum and the variability over the traces (or subjects) using a mixed effects model. Numerical studies and a real data example confirm that the proposed methods perform very well. (STAT 3) -- UCL, 201

    Discussion: Time-threshold maps: Using information from wavelet reconstructions with all threshold values simultaneously

    First of all, we would like to congratulate the author for having provided a nice addition to the literature on wavelet thresholding which is a little different from providing “another more refined choice of the threshold value”. As he says himself in the Introduction, existing papers on “threshold selection procedures in the function estimation problem advocate the choice of one single threshold value for each wavelet coefficients, …”. This approach is also referred to as “separable” or “diagonal” thresholding in the literature. What we can learn from his paper is that trying to optimize this threshold value is perhaps not the only route to go down when it comes to denoising statistical signals with sufficiently interesting local structure: as soon as our interest goes beyond an overall satisfying and uniformly not-too-bad reconstruction of the signal (i.e. estimation of an underlying somewhat “smooth” function), it is certainly a good idea to broaden the view on thresholding schemes. This applies to situations where we want to simultaneously estimate the overall shape of the function and the precise location of some pronounced discontinuities, or to cases where our denoised reconstruction is supposed to give us an idea about some (spatial or temporal) segmentation behind the data-generating process. What we address in our little discussion is in the line of thought of this paradigm: while the author contributes a vector-valued view, a whole (visual) map of reconstructions using different (but still “diagonal”) threshold values, a complementary approach could be to change the thresholding rule away from “separable” and use rules built on vectors or blocks of coefficients that share some property of closeness. The size (or structural “geometry”) of those rules could then be used as an index to compare the performance of such a “family” of threshold rules (rather than threshold values, as in this paper) with one another; and why not again use a mapping over time to visualize this performance and try to learn from the information it carries?
    We will structure the sequel of our contribution as follows. In a first part, without going into too much detail, we would like to put a few questions to the author and build bridges to some of the applications that he gives towards the end of his paper. In the second, larger part, we give a short summary of recent work on “block thresholding” rules: these rules have been advocated not primarily to minimize “uniform” estimation criteria such as the mean squared error or certain minimax risks (as, e.g., in the seminal work by Cai (1999) on horizontal block thresholding, using blocks of spatially neighboring coefficients within one scale). Rather, they were developed to improve function estimation in the vicinity of a discontinuity, such as a jump, and this also visually; cf. Figs. 3.1–3.2. They do this by using “vertical blocks” of coefficients following a certain geometrical structure across wavelet scales, i.e. a tree structure.
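    A horizontal block rule of the kind attributed above to Cai (1999) can be sketched as follows: detail coefficients within one scale are grouped into blocks of length L, and each block is kept or shrunk as a whole according to its energy, in James-Stein fashion. The threshold constant and block length below are illustrative defaults, not prescriptions from the discussed paper.

```python
import numpy as np

def block_james_stein(details, L=4, lam=4.505, sigma=1.0):
    """Shrink each non-overlapping block b of detail coefficients by the
    factor max(0, 1 - lam * L * sigma^2 / ||b||^2); a trailing partial
    block is left untouched."""
    d = np.asarray(details, dtype=float).copy()
    for start in range(0, d.size - d.size % L, L):
        block = d[start : start + L]
        energy = np.sum(block**2)
        factor = max(0.0, 1.0 - lam * L * sigma**2 / energy) if energy > 0 else 0.0
        d[start : start + L] = factor * block
    return d
```

    The point of pooling is visible in the rule itself: a single large coefficient keeps its whole neighborhood alive, while blocks whose total energy is at the noise level are removed entirely.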

    Bayesian approach to infer the duration of antibody seropositivity and neutralizing responses to SARS-CoV-2

    Estimating the duration of natural immunity induced by SARS-CoV-2 infection is crucial for health policy strategies. A patient infected by SARS-CoV-2 quickly produces three antibody isotypes, IgM, IgG, and IgA, that reveal an infection. In this paper, we use a Bayesian two-component mixture of random coefficient models to capture the longitudinal/temporal evolution of antibody levels, as well as viral neutralization, on the dataset reported by Seow et al. in [1]. We observe that the more severe the symptoms, the more intense the antibody and immune responses, and the more their decline is decelerated. Moreover, it appears that viral neutralization is best predicted by the level of IgM or IgA antibodies, rather than by the IgG level. Furthermore, our model is particularly suitable for estimating the Probability of being Out of Detection. We thus observe that although antibodies persist for up to 5 months in the plasma, the probability of becoming undetectable exceeds 50% after 3 months.
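    To make the "Probability of being Out of Detection" concrete, here is a deliberately simplified illustration, not the paper's mixture model: assume the log antibody level declines linearly in time with Gaussian subject-level noise, so the probability of falling below the assay's limit of detection (LOD) is a normal tail probability. Every number and parameter name below is made up for the sketch.

```python
import math

def prob_out_of_detection(t, log_level0, slope, sd, log_lod):
    """P(log level at time t < log LOD), assuming
    log level ~ N(log_level0 - slope * t, sd^2)."""
    mean_t = log_level0 - slope * t
    z = (log_lod - mean_t) / sd
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

    Under such a model, the probability crosses 50% exactly when the mean trajectory hits the LOD, which matches the qualitative finding quoted above.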

    On the performance of isotropic and hyperbolic wavelet estimators

    In this paper, we differentiate between isotropic and hyperbolic wavelet bases in the context of multivariate nonparametric function estimation. The study of the latter leads to new phenomena and non-trivial extensions of univariate studies. In this context, we first exhibit the limitations of isotropic wavelet estimators by proving that no isotropic estimator is able to guarantee the reconstruction of a function with anisotropy in an optimal or near-optimal way. Second, we show that hyperbolic wavelet estimators are well suited to reconstruct anisotropic functions. In particular, for each considered estimator we focus on the rates at which it can reconstruct functions from anisotropic Besov spaces. We then compute the estimator's maxiset, that is, the largest functional space over which its risk converges at these rates. Our results furnish novel arguments to understand the primordial role of sparsity and thresholding in multivariate contexts, notably by showing the exposure of linear methods to the curse of dimensionality. Moreover, we propose a block thresholding hyperbolic estimator and show its ability to estimate anisotropic functions at the optimal minimax rate and, relatedly, the continued pertinence of information pooling in high-dimensional settings.

    Asymptotic performance of projection estimators in standard and hyperbolic wavelet bases

    We provide a novel treatment of the ability of the standard (wavelet-tensor) and of the hyperbolic (tensor-product) wavelet bases to build nonparametric estimators of multivariate functions. First, we give new results on the limitations of wavelet estimators based on the standard wavelet basis, namely their inability to optimally reconstruct functions with anisotropic smoothness. Next, we provide optimal or near-optimal rates at which both linear and non-linear hyperbolic wavelet estimators are well suited to reconstruct functions from anisotropic Besov spaces, and subsequently we characterize the set of all functions that are well reconstructed by these methods with respect to these rates. As a first main result, we furnish novel arguments to understand the primordial role of sparsity and thresholding in multivariate contexts, in particular by showing a stronger exposure of linear methods to the curse of dimensionality. Second, we propose an adaptation of the well-known block thresholding method to a hyperbolic wavelet basis and show its ability to estimate functions with anisotropic smoothness at the optimal minimax rate, thereby proving the pertinence of horizontal information pooling even in high-dimensional settings. Numerical experiments illustrate the finite-sample properties of the studied estimators.

    Tree-structured Wavelet Estimation in a Mixed Effects Model for Spectra of Replicated Time Series

    This paper develops a method for estimating the spectrum of a stationary process using time series traces recorded from experimental designs. Our procedure estimates the “common” log-spectrum and the variability over the traces (or subjects) using a mixed effects model. We combine the use of spatially adaptive smoothing methods with recursive dyadic partitioning to construct a predictive model. The method is easy to implement and can handle large data sets because it uses the discrete wavelet transform, which is computationally efficient. Numerical studies confirm that the proposed method performs very well despite its simplicity. The method is also applied to a multi-subject electroencephalogram data set.
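    The two quantities being modeled, the common log-spectrum and the between-trace variability, can be illustrated with a back-of-the-envelope version that uses raw log-periodograms and plain moments in place of the paper's wavelet and mixed-effects machinery. This is purely a sketch of the target quantities, not of the estimation method itself.

```python
import numpy as np

def log_periodogram(trace):
    """Log-periodogram of one trace at the positive Fourier frequencies."""
    x = np.asarray(trace, dtype=float)
    n = x.size
    pgram = np.abs(np.fft.rfft(x - x.mean())) ** 2 / n
    return np.log(pgram[1:])  # drop the zero frequency

def common_log_spectrum(traces):
    """Pointwise mean log-periodogram across traces (the "common" effect)
    and the between-trace sample variance (the random-effect variability)."""
    logs = np.stack([log_periodogram(t) for t in traces])
    return logs.mean(axis=0), logs.var(axis=0, ddof=1)
```

    In the paper these two components are estimated jointly and smoothed adaptively; the naive averages above only indicate what the mixed effects model decomposes.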